机器学习和计算机视觉是动态增长的领域,事实证明,它们能够解决非常复杂的任务。它们也可以用于监测蜜蜂菌落和检查其健康状态,在这种情况至关重要之前,可以确定潜在的危险状态,或者更好地计划定期的蜜蜂殖民地检查,从而节省大量费用。在本文中,我们介绍了用于蜜蜂监视的最先进的计算机视觉和机器学习应用程序。我们还证明了这些方法的潜力,作为自动蜜蜂计数器算法的一个例子。该论文针对的是兽医和养育专业人士和专家,他们可能不熟悉机器学习来向他们介绍其可能性,因此,每个应用程序都通过与基本方法相关的简短理论介绍和动机来打开。我们希望本文能够激发其他科学家将机器学习技术用于蜜蜂监测中的其他应用。
translated by 谷歌翻译
Diffusion models have shown a great ability at bridging the performance gap between predictive and generative approaches for speech enhancement. We have shown that they may even outperform their predictive counterparts for non-additive corruption types or when they are evaluated on mismatched conditions. However, diffusion models suffer from a high computational burden, mainly as they require to run a neural network for each reverse diffusion step, whereas predictive approaches only require one pass. As diffusion models are generative approaches they may also produce vocalizing and breathing artifacts in adverse conditions. In comparison, in such difficult scenarios, predictive models typically do not produce such artifacts but tend to distort the target speech instead, thereby degrading the speech quality. In this work, we present a stochastic regeneration approach where an estimate given by a predictive model is provided as a guide for further diffusion. We show that the proposed approach uses the predictive model to remove the vocalizing and breathing artifacts while producing very high quality samples thanks to the diffusion model, even in adverse conditions. We further show that this approach enables to use lighter sampling schemes with fewer diffusion steps without sacrificing quality, thus lifting the computational burden by an order of magnitude. Source code and audio examples are available online (https://uhh.de/inf-sp-storm).
translated by 谷歌翻译
We outline our work on evaluating robots that assist older adults by engaging with them through multiple modalities that include physical interaction. Our thesis is that to increase the effectiveness of assistive robots: 1) robots need to understand and effect multimodal actions, 2) robots should not only react to the human, they need to take the initiative and lead the task when it is necessary. We start by briefly introducing our proposed framework for multimodal interaction and then describe two different experiments with the actual robots. In the first experiment, a Baxter robot helps a human find and locate an object using the Multimodal Interaction Manager (MIM) framework. In the second experiment, a NAO robot is used in the same task, however, the roles of the robot and the human are reversed. We discuss the evaluation methods that were used in these experiments, including different metrics employed to characterize the performance of the robot in each case. We conclude by providing our perspective on the challenges and opportunities for the evaluation of assistive robots for older adults in realistic settings.
translated by 谷歌翻译
Nucleolar organizer regions (NORs) are parts of the DNA that are involved in RNA transcription. Due to the silver affinity of associated proteins, argyrophilic NORs (AgNORs) can be visualized using silver-based staining. The average number of AgNORs per nucleus has been shown to be a prognostic factor for predicting the outcome of many tumors. Since manual detection of AgNORs is laborious, automation is of high interest. We present a deep learning-based pipeline for automatically determining the AgNOR-score from histopathological sections. An additional annotation experiment was conducted with six pathologists to provide an independent performance evaluation of our approach. Across all raters and images, we found a mean squared error of 0.054 between the AgNOR- scores of the experts and those of the model, indicating that our approach offers performance comparable to humans.
translated by 谷歌翻译
We present a unified probabilistic model that learns a representative set of discrete vehicle actions and predicts the probability of each action given a particular scenario. Our model also enables us to estimate the distribution over continuous trajectories conditioned on a scenario, representing what each discrete action would look like if executed in that scenario. While our primary objective is to learn representative action sets, these capabilities combine to produce accurate multimodal trajectory predictions as a byproduct. Although our learned action representations closely resemble semantically meaningful categories (e.g., "go straight", "turn left", etc.), our method is entirely self-supervised and does not utilize any manually generated labels or categories. Our method builds upon recent advances in variational inference and deep unsupervised clustering, resulting in full distribution estimates based on deterministic model evaluations.
translated by 谷歌翻译
Automatically fixing software bugs is a challenging task. While recent work showed that natural language context is useful in guiding bug-fixing models, the approach required prompting developers to provide this context, which was simulated through commit messages written after the bug-fixing code changes were made. We instead propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for any additional information from developers. For this, we augment standard bug-fixing datasets with bug report discussions. Using these newly compiled datasets, we demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
translated by 谷歌翻译
Diffusion-based generative models have had a high impact on the computer vision and speech processing communities these past years. Besides data generation tasks, they have also been employed for data restoration tasks like speech enhancement and dereverberation. While discriminative models have traditionally been argued to be more powerful e.g. for speech enhancement, generative diffusion approaches have recently been shown to narrow this performance gap considerably. In this paper, we systematically compare the performance of generative diffusion models and discriminative approaches on different speech restoration tasks. For this, we extend our prior contributions on diffusion-based speech enhancement in the complex time-frequency domain to the task of bandwith extension. We then compare it to a discriminatively trained neural network with the same network architecture on three restoration tasks, namely speech denoising, dereverberation and bandwidth extension. We observe that the generative approach performs globally better than its discriminative counterpart on all tasks, with the strongest benefit for non-additive distortion models, like in dereverberation and bandwidth extension. Code and audio examples can be found online at https://uhh.de/inf-sp-sgmsemultitask
translated by 谷歌翻译
最近一年带来了电动汽车(EV)和相关基础设施/通信的大幅进步。入侵检测系统(ID)被广泛部署在此类关键基础架构中的异常检测。本文提出了一个可解释的异常检测系统(RX-ADS),用于在电动汽车中的CAN协议中进行入侵检测。贡献包括:1)基于窗口的特征提取方法; 2)基于深度自动编码器的异常检测方法; 3)基于对抗机器学习的解释生成方法。在两个基准CAN数据集上测试了提出的方法:OTID和汽车黑客。将RX-ADS的异常检测性能与这些数据集的最新方法进行了比较:HID和GID。 RX-ADS方法提出的性能与HIDS方法(OTIDS数据集)相当,并且具有超出HID和GID方法(CAR HACKING DATASET)的表现。此外,所提出的方法能够为因各种侵入而引起的异常行为产生解释。这些解释后来通过域专家使用的信息来检测异常来验证。 RX-ADS的其他优点包括:1)该方法可以在未标记的数据上进行培训; 2)解释有助于专家理解异常和根课程分析,并有助于AI模型调试和诊断,最终改善了对AI系统的用户信任。
translated by 谷歌翻译
自然语言处理(NLP)是一个人工智能领域,它应用信息技术来处理人类语言,在一定程度上理解并在各种应用中使用它。在过去的几年中,该领域已经迅速发展,现在采用了深层神经网络的现代变体来从大型文本语料库中提取相关模式。这项工作的主要目的是调查NLP在药理学领域的最新使用。正如我们的工作所表明的那样,NLP是药理学高度相关的信息提取和处理方法。它已被广泛使用,从智能搜索到成千上万的医疗文件到在社交媒体中找到对抗性药物相互作用的痕迹。我们将覆盖范围分为五个类别,以调查现代NLP方法论,常见的任务,相关的文本数据,知识库和有用的编程库。我们将这五个类别分为适当的子类别,描述其主要属性和想法,并以表格形式进行总结。最终的调查介绍了该领域的全面概述,对从业者和感兴趣的观察者有用。
translated by 谷歌翻译
微型机构中的一个重要问题是如何控制具有全局控制信号的大型微型机器人。本文着重于控制大规模的微棉机器人,并使用车载物理有限状态机器来控制。我们介绍了基于组的控制的概念,这使得可以扩大群体大小,同时降低机器人制造和群体控制的复杂性。我们证明,基于组的控制系统可以从机器人位置上进行本地访问。我们进一步基于广泛的模拟,即该系统在全球可控。提出了一种非线性优化策略,以最大程度地减少控制努力来控制群体。我们还提出了一种适合在线使用的概率完整的避免碰撞方法。本文以对模拟中提出的方法的评估结束。
translated by 谷歌翻译